Tabular Data Cleaning and Linked Data Generation with Grafterizer
Abstract
Over the past several years, the amount of published open data has increased significantly. The majority of this is tabular data, which requires powerful and flexible approaches to data cleaning and preparation before it can be converted into Linked Data. This paper introduces Grafterizer, a software framework developed to support data workers and data developers in converting raw tabular data into Linked Data. Its main components include Grafter, a powerful software library and DSL for data cleaning and RDF-ization, and Grafterizer, a user interface for interactively specifying data transformations, along with a back-end for managing and executing them. The proposed demonstration will focus on Grafterizer’s features for data cleaning and RDF-ization in a scenario using data about the risk of failure of transport infrastructure components due to natural hazards.
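As a rough illustration of the kind of pipeline the abstract describes (clean a raw table, then map each row to RDF), the following is a minimal sketch in plain Python. It does not use Grafter or Grafterizer and does not reflect their APIs; the column names, the sample rows, and the `http://example.org/` namespace are all hypothetical.

```python
import csv
import io

# Hypothetical namespace; a real dataset would use its own URIs.
BASE = "http://example.org/"
XSD_DOUBLE = "http://www.w3.org/2001/XMLSchema#double"

# Hypothetical raw table with stray whitespace and an empty row.
RAW_CSV = """component, hazard ,risk
Bridge A,flood,0.12
,,
Tunnel B,landslide, 0.30
"""


def clean_rows(text):
    """Cleaning step: normalise headers, trim cells, drop empty rows."""
    reader = csv.reader(io.StringIO(text))
    header = [h.strip().lower() for h in next(reader)]
    for row in reader:
        cells = [c.strip() for c in row]
        if not any(cells):
            continue  # skip rows that carry no data
        yield dict(zip(header, cells))


def slug(value):
    """Turn a cell value into a URI-friendly path segment."""
    return "-".join(value.lower().split())


def row_to_triples(row):
    """RDF-ization step: map one cleaned row to N-Triples lines."""
    subject = f"<{BASE}component/{slug(row['component'])}>"
    hazard = f"<{BASE}hazard/{slug(row['hazard'])}>"
    risk = float(row["risk"])  # validates that the cell is numeric
    return [
        f"{subject} <{BASE}def/hazard> {hazard} .",
        f'{subject} <{BASE}def/risk> "{risk}"^^<{XSD_DOUBLE}> .',
    ]


def csv_to_ntriples(text):
    """Run cleaning and RDF-ization over the whole table."""
    return [t for row in clean_rows(text) for t in row_to_triples(row)]


if __name__ == "__main__":
    for triple in csv_to_ntriples(RAW_CSV):
        print(triple)
```

Keeping the cleaning step and the RDF mapping step as separate functions mirrors the two concerns named in the abstract, so either one can be changed without touching the other.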
Publication date: 2016